Robustness to telephone handset distortion in speaker recognition by discriminative feature design
Abstract
A method is described for designing speaker recognition features that are robust to telephone handset distortion. The approach transforms features such as mel-cepstral features, log spectrum, and prosody-based features with a non-linear artificial neural network. The neural network is discriminatively trained to maximize speaker recognition performance specifically in the setting of telephone handset mismatch between training and testing. The algorithm requires neither stereo recordings of speech during training nor manual labeling of handset types either in training or testing. Results on the 1998 National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation corpus show relative improvements as high as 28% for the new multilayered perceptron (MLP)-based features as compared to a standard mel-cepstral feature set with cepstral mean subtraction (CMS) and handset-dependent normalizing impostor models. © 2000 Elsevier Science B.V. All rights reserved.
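The baseline named in the abstract uses cepstral mean subtraction (CMS). As a minimal sketch of that baseline step only (not the paper's MLP-based method), the example below illustrates the standard property that a fixed linear channel shows up as a constant additive offset in the cepstral domain, which per-utterance mean subtraction removes. The function name `cepstral_mean_subtraction`, the 13-coefficient frame size, and the random synthetic frames are assumptions made for illustration and do not come from the paper.

```python
import math
import random


def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean of each cepstral coefficient.

    frames: list of frames, each a list of cepstral coefficients.
    """
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    means = [
        sum(frame[k] for frame in frames) / num_frames
        for k in range(num_coeffs)
    ]
    return [[frame[k] - means[k] for k in range(num_coeffs)] for frame in frames]


# Synthetic stand-in for mel-cepstral frames (random values, not real speech).
random.seed(0)
clean = [[random.gauss(0.0, 1.0) for _ in range(13)] for _ in range(200)]

# Assumption: the channel is a fixed linear filter, modeled here as a
# constant offset added to every frame in the cepstral domain.
channel_offset = [random.gauss(0.0, 1.0) for _ in range(13)]
distorted = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_distorted = cepstral_mean_subtraction(distorted)

# After CMS the clean and distorted utterances should match up to rounding.
offset_removed = all(
    math.isclose(a, b, abs_tol=1e-9)
    for frame_a, frame_b in zip(normalized_clean, normalized_distorted)
    for a, b in zip(frame_a, frame_b)
)
print(offset_removed)
```

CMS only cancels a constant offset of this kind. It cannot undo a non-linear distortion, which is consistent with the abstract's choice of a non-linear network for the feature transformation.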
Similar resources
Stochastic Feature Transformation with Divergence-Based Out-of-Handset Rejection for Robust Speaker Verification
The performance of telephone-based speaker verification systems can be severely degraded by linear and non-linear acoustic distortion caused by telephone handsets. This paper proposes to combine a handset selector with stochastic feature transformation to reduce the distortion. Specifically, a GMM-based handset selector is trained to identify the most likely handset used by the claimants, and th...
Handset-dependent background models for robust text-independent speaker recognition
This paper studies the effects of handset distortion on telephone-based speaker recognition performance, resulting in the following observations: (1) the major factor in speaker recognition errors is whether the handset type (e.g., electret, carbon) is different across training and testing, not whether the telephone lines are mismatched, (2) the distribution of speaker recognition scores for true...
Cluster-Dependent Feature Transformation for Telephone-Based Speaker Verification
This paper presents a cluster-based feature transformation technique for telephone-based speaker verification when labels of the handset types are not available during the training phase. The technique combines a cluster selector with cluster-dependent feature transformations to reduce the acoustic mismatches among different handsets. Specifically, a GMM-based cluster selector is trained to ide...
Sun-Yuan Kung, Speaker Verification from Coded Telephone Speech Using Stochastic Feature Transformation and Handset Identification
A handset compensation technique for speaker verification from coded telephone speech is proposed. The proposed technique combines handset selectors with stochastic feature transformation to reduce the acoustic mismatch between different handsets and different speech coders. Coder-dependent GMM-based handset selectors are trained to identify the most likely handset used by the claimants. Stocha...
Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification
Robustness due to mismatched train/test conditions is one of the biggest challenges facing speaker recognition today, with transmission channel/handset and additive noise distortion being the most prominent factors. One limitation of the recent speaker recognition systems is that they are based on a latent factor analysis modeling of the GMM mean super-vectors alone. Motivated by the covariance...
Journal title:
- Speech Communication
Volume 31
Publication date: 2000